{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# SHAP vs EBM Comparison\n",
    "\n",
    "SHAP is the most popular blackbox explanation method available today, so it’s natural to ask what the similarities and differences are between SHAP and EBM explanations. Let’s find out!\n",
    "\n",
    "One reason you might prefer a glassbox Explainable Boosting Machine (EBM) over a blackbox model with SHAP explanations is that the EBM provides its own explanations directly, and those explanations are always exact. SHAP does a great job of decomposing a complex model into independent feature importances, but that process loses some of the information present in the blackbox model. SHAP explanations, while usually informative, are only approximate explanations of an inherently more complex model. Any explanation method that expresses its explanations as independent per-feature components shares this limitation.\n",
    "\n",
    "Let's look at some examples to illustrate. We'll start by training a simple GAM (generalized additive model) and applying SHAP to it. For this initial example we would expect the EBM and SHAP explanations to be very similar, since a GAM can be decomposed into independent per-feature components."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# boilerplate\n",
    "\n",
    "from interpret import show\n",
    "from interpret.glassbox import ExplainableBoostingRegressor\n",
    "from interpret.blackbox import ShapKernel\n",
    "import numpy as np\n",
    "from sklearn.metrics import mean_squared_error\n",
    "\n",
    "from interpret import set_visualize_provider\n",
    "from interpret.provider import InlineProvider\n",
    "set_visualize_provider(InlineProvider())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# make a synthetic dataset with 2 features and 80 samples\n",
    "X = np.repeat([[1, 1], [-1, -1], [-1, 1], [1, -1]], 20, axis=0)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# y_simple can be predicted with a simple generalized additive model\n",
    "y_simple = X[:, 0] * 3 + X[:, 1] * 3\n",
    "# y_simple == np.repeat([6, -6, 0, 0], 20)\n",
    "\n",
    "ebm_simple = ExplainableBoostingRegressor()\n",
    "ebm_simple.fit(X, y_simple)\n",
    "\n",
    "# low MSE indicates the model is a good fit\n",
    "print(mean_squared_error(y_simple, ebm_simple.predict(X)))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# generate an EBM explanation and show the first sample\n",
    "\n",
    "show(ebm_simple.explain_local(X, y_simple), 0)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<br/>\n",
    "<br/>\n",
    "<br/>\n",
    "<br/>\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# generate a SHAP explanation and show the first sample\n",
    "\n",
    "shap_simple = ShapKernel(ebm_simple.predict, X)\n",
    "show(shap_simple.explain_local(X, y_simple), 0)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As expected, the EBM and SHAP explanations of the GAM are nearly identical.\n",
    "\n",
    "Let's now construct an extreme dataset that consists entirely of a single pairwise interaction, with no main effects at all. Traditional GAMs cannot fit a useful model on this dataset, but EBMs can because they include pairwise interaction terms."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# y_pairwise requires an interaction term to predict\n",
    "y_pairwise = np.where((X[:, 0] > 0) ^ (X[:, 1] > 0), -6, 6)\n",
    "# y_pairwise == np.repeat([6, 6, -6, -6], 20)\n",
    "\n",
    "ebm_pairwise = ExplainableBoostingRegressor()\n",
    "ebm_pairwise.fit(X, y_pairwise)\n",
    "\n",
    "# low MSE indicates the model is a good fit\n",
    "print(mean_squared_error(y_pairwise, ebm_pairwise.predict(X)))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# generate an EBM explanation and show the first sample\n",
    "\n",
    "show(ebm_pairwise.explain_local(X, y_pairwise), 0)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<br/>\n",
    "<br/>\n",
    "<br/>\n",
    "<br/>\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# generate a SHAP explanation and show the first sample\n",
    "\n",
    "shap_pairwise = ShapKernel(ebm_pairwise.predict, X)\n",
    "show(shap_pairwise.explain_local(X, y_pairwise), 0)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The explanations are clearly different now. SHAP has no innate understanding of the pairwise term, so it maps the importance of the EBM's pair onto the two individual features. Spreading the importance this way is the best that any method could do if importance must be expressed on a per-feature basis.\n",
    "\n",
    "It should also be noted that the SHAP explanations shown for the simple GAM and the pairwise model are identical, even though the data and models are very different. The UI for the local SHAP explanation is currently positioned on the first sample, and the same match occurs for every sample whose two features are both 1. For the remaining samples the two models predict different values, so their SHAP values, which sum to the prediction minus the average prediction, differ as well. The dropdown in the UI is live, so the explanations for the other samples can also be viewed.\n",
    "\n",
    "We have trained two very different models that produce different outputs. Given only the SHAP explanation shown above, though, it would be impossible to differentiate between the two models. The SHAP values for that sample would also remain identical if we blended the two models together in any proportion. Most blackbox models will not be composed purely of interaction effects as this one is, but most will contain at least some interaction effects.\n",
    "\n",
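    "As a minimal sketch, not part of the `interpret` or `shap` APIs, the match for the first sample can be checked by computing exact Shapley values by brute force. The functions `f_additive`, `f_xor`, and `f_blend` are hypothetical stand-ins for the fitted models (they reproduce the targets exactly, which the EBMs only approximate), the 25/75 blend is an arbitrary choice, and the background data is assumed to be the dataset itself."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "# same synthetic data as above; assumed to double as the background distribution\n",
    "X_bg = np.repeat([[1, 1], [-1, -1], [-1, 1], [1, -1]], 20, axis=0)\n",
    "\n",
    "def f_additive(data):\n",
    "    # hypothetical stand-in for ebm_simple: purely additive\n",
    "    return data[:, 0] * 3 + data[:, 1] * 3\n",
    "\n",
    "def f_xor(data):\n",
    "    # hypothetical stand-in for ebm_pairwise: pure XOR interaction\n",
    "    return np.where((data[:, 0] > 0) ^ (data[:, 1] > 0), -6, 6)\n",
    "\n",
    "def f_blend(data):\n",
    "    # an arbitrary 25/75 blend of the two stand-in models\n",
    "    return 0.25 * f_additive(data) + 0.75 * f_xor(data)\n",
    "\n",
    "def coalition_value(f, x, background, known):\n",
    "    # mean output when the features in `known` are fixed to the values in x\n",
    "    # and the remaining features are taken from the background rows\n",
    "    Z = background.copy()\n",
    "    for j in known:\n",
    "        Z[:, j] = x[j]\n",
    "    return f(Z).mean()\n",
    "\n",
    "def exact_shapley_2d(f, x, background):\n",
    "    # exact Shapley values for a two-feature model: average each feature's\n",
    "    # marginal contribution over both feature orderings\n",
    "    v = {s: coalition_value(f, x, background, s) for s in [(), (0,), (1,), (0, 1)]}\n",
    "    phi0 = 0.5 * ((v[(0,)] - v[()]) + (v[(0, 1)] - v[(1,)]))\n",
    "    phi1 = 0.5 * ((v[(1,)] - v[()]) + (v[(0, 1)] - v[(0,)]))\n",
    "    return np.array([phi0, phi1])\n",
    "\n",
    "x_first = X_bg[0]\n",
    "for f in (f_additive, f_xor, f_blend):\n",
    "    print(f.__name__, exact_shapley_2d(f, x_first, X_bg))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In this sketch all three stand-ins assign a value of 3 to each feature for the first sample, which is why the per-feature attributions cannot tell them apart.\n",
    "\n",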
    "The examples above used Kernel SHAP. Unlike Kernel SHAP, Tree SHAP is known to produce exact SHAP values. Many people mistakenly believe that this means Tree SHAP produces exact explanations. In reality, Tree SHAP produces exact SHAP values, but exact SHAP values are still only approximate explanations.\n",
    "\n",
    "Decision trees are another form of glassbox model. Like EBMs, they can learn pairwise interactions. Let's see how Tree SHAP handles a decision tree trained on `y_pairwise`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from interpret.glassbox import RegressionTree\n",
    "\n",
    "tree = RegressionTree()\n",
    "tree.fit(X, y_pairwise)\n",
    "\n",
    "# low MSE indicates the model is a good fit\n",
    "print(mean_squared_error(y_pairwise, tree.predict(X)))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "show(tree.explain_global())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can see that the decision tree has learned the same XOR pairwise interaction as the pairwise EBM, so it should produce nearly identical predictions. We can verify this by comparing the predictions of the tree to those of the pairwise EBM, where the MSE should be close to zero. For contrast we also compare against the simple EBM, which was trained on `y_simple` and should therefore disagree with the tree."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
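    "# pairwise EBM vs tree: both were fit on y_pairwise, so this should be near zero\n",
    "print(mean_squared_error(ebm_pairwise.predict(X), tree.predict(X)))\n",
    "# simple EBM vs tree: fit on different targets, so this should be much larger\n",
    "print(mean_squared_error(ebm_simple.predict(X), tree.predict(X)))"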
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# generate a Tree SHAP explanation and show the first sample\n",
    "\n",
    "from interpret.greybox import ShapTree\n",
    "\n",
    "tree_shap = ShapTree(tree.sk_model_, X)\n",
    "show(tree_shap.explain_local(X, y_pairwise), 0)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As you can see, Tree SHAP returns explanations almost identical to those of Kernel SHAP. With Tree SHAP the feature importances are both exactly 3.0, while Kernel SHAP returned estimated feature importances of 2.99. The pairwise detail is still lost in the Tree SHAP explanations, however, which makes them only approximate explanations as well.\n",
    "\n",
    "It should be mentioned that the SHAP package has some optional functionality to handle interaction terms. Simpler models, such as an EBM with a limited number of pairwise terms, could have all the possible pairwise SHAP values calculated. In terms of getting exact explanations though, this method only works in practice for what are already glassbox models. Even simple blackbox models tend to have large numbers of much higher dimensional terms expressed in the model. The default XGBoost model is built with `max_depth=6` and `n_estimators=100`, which means that there are 2^5 * 100 = 3,200 potential 6-way interactions inside that model. The model would also have an even larger number of 5-ways, 4-ways, 3-ways, and pairs that came from the purification distillates of the 6-way interactions. In theory all of these interactions could have SHAP values calculated for them, but such a complex explanation would not be understandable by humans. It would also take far too long to compute these SHAP values except on trivial datasets. EBMs avoid these issues by forcing the model to learn as much as possible from the individual features and a curated set of the most important interactions.\n",
    "\n",
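    "As a minimal sketch of the interaction idea on this notebook's toy data, the cell below computes SHAP-style interaction values by brute force rather than through the SHAP package. `f_additive` and `f_xor` are hypothetical stand-ins for the two fitted EBMs, and the background data is assumed to be the dataset itself. With only two features, the off-diagonal interaction value is half of v({0,1}) - v({0}) - v({1}) + v({}), and each diagonal main effect is the feature's Shapley value minus that share."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "# same synthetic data as above; assumed to double as the background distribution\n",
    "X_bg = np.repeat([[1, 1], [-1, -1], [-1, 1], [1, -1]], 20, axis=0)\n",
    "\n",
    "def f_additive(data):\n",
    "    # hypothetical stand-in for ebm_simple: purely additive\n",
    "    return data[:, 0] * 3 + data[:, 1] * 3\n",
    "\n",
    "def f_xor(data):\n",
    "    # hypothetical stand-in for ebm_pairwise: pure XOR interaction\n",
    "    return np.where((data[:, 0] > 0) ^ (data[:, 1] > 0), -6, 6)\n",
    "\n",
    "def coalition_value(f, x, background, known):\n",
    "    # mean output when the features in `known` are fixed to the values in x\n",
    "    # and the remaining features are taken from the background rows\n",
    "    Z = background.copy()\n",
    "    for j in known:\n",
    "        Z[:, j] = x[j]\n",
    "    return f(Z).mean()\n",
    "\n",
    "def shap_interaction_matrix(f, x, background):\n",
    "    # 2x2 matrix: main effects on the diagonal, the pair's share off the diagonal\n",
    "    v = {s: coalition_value(f, x, background, s) for s in [(), (0,), (1,), (0, 1)]}\n",
    "    phi0 = 0.5 * ((v[(0,)] - v[()]) + (v[(0, 1)] - v[(1,)]))\n",
    "    phi1 = 0.5 * ((v[(1,)] - v[()]) + (v[(0, 1)] - v[(0,)]))\n",
    "    pair = 0.5 * (v[(0, 1)] - v[(0,)] - v[(1,)] + v[()])\n",
    "    return np.array([[phi0 - pair, pair], [pair, phi1 - pair]])\n",
    "\n",
    "x_first = X_bg[0]\n",
    "print(shap_interaction_matrix(f_additive, x_first, X_bg))\n",
    "print(shap_interaction_matrix(f_xor, x_first, X_bg))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In this sketch the additive stand-in places all of its attribution on the diagonal, while the XOR stand-in places all of it off the diagonal, so the two models become distinguishable once the pair is explained explicitly. Each row still sums to the per-feature value of 3.\n",
    "\n",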
    "For more information about SHAP interaction terms see [Basic SHAP Interaction Value Example in XGBoost](https://github.com/slundberg/shap/blob/master/notebooks/tabular_examples/tree_based_models/Basic%20SHAP%20Interaction%20Value%20Example%20in%20XGBoost.ipynb)"
   ]
  }
 ],
 "metadata": {
  "language_info": {
   "name": "python"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
